Skip to content

Conversation

savannahostrowski
Copy link
Member

/Users/runner/work/cpython/cpython/Tools/jit/trampoline.c:13:13: warning: 'visibility' attribute ignored [-Wignored-attributes]
   13 |     typedef DECLARE_TARGET((*jit_func));
      |             ^
/Users/runner/work/cpython/cpython/Tools/jit/jit.h:11:49: note: expanded from macro 'DECLARE_TARGET'
   11 |     _Py_CODEUNIT *__attribute__((preserve_none, visibility("hidden"))) \
      |

This was introduced in #136528. The DECLARE_TARGET macro includes visibility("hidden"), which is valid for forward declarations but gets ignored when used in a typedef, so the compiler is yelling.

@savannahostrowski savannahostrowski changed the title Fix compiler warning from misusing DECLARE_TARGET in typedef JIT: Fix compiler warning from visibility attribute in typedef Oct 12, 2025
Copy link
Member

@chris-eibl chris-eibl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Basically the same as my PR #140286 I've just closed since @Fidget-Spinner made me aware of yours :)

Just a suggestion:

@chris-eibl
Copy link
Member

Sorry for being such a pain, this is now merely for my understanding and I hope asking this question here is ok:

When playing with the above Godbolt link, I more or less get the same nice output for jit_stencils-x86_64-unknown-linux-gnu.h in WSL:

void
emit_trampoline(
    unsigned char *code, unsigned char *data, _PyExecutorObject *executor,
    const _PyUOpInstruction *instruction, jit_state *state)
{
    // 
    // trampoline.o:  file format elf64-x86-64
    // 
    // Disassembly of section .text:
    // 
    // 0000000000000000 <_JIT_ENTRY>:
    // 0: 41 57                         pushq   %r15
    // 2: 41 56                         pushq   %r14
    // 4: 41 55                         pushq   %r13
    // 6: 41 54                         pushq   %r12
    // 8: 53                            pushq   %rbx
    // 9: 49 89 f4                      movq    %rsi, %r12
    // c: 49 89 d5                      movq    %rdx, %r13
    // f: 49 89 ce                      movq    %rcx, %r14
    // 12: ff 57 70                      callq   *0x70(%rdi)
    // 15: 5b                            popq    %rbx
    // 16: 41 5c                         popq    %r12
    // 18: 41 5d                         popq    %r13
    // 1a: 41 5e                         popq    %r14
    // 1c: 41 5f                         popq    %r15
    // 1e: c3                            retq
    const unsigned char code_body[31] = {
        0x41, 0x57, 0x41, 0x56, 0x41, 0x55, 0x41, 0x54,
        0x53, 0x49, 0x89, 0xf4, 0x49, 0x89, 0xd5, 0x49,
        0x89, 0xce, 0xff, 0x57, 0x70, 0x5b, 0x41, 0x5c,
        0x41, 0x5d, 0x41, 0x5e, 0x41, 0x5f, 0xc3,
    };
    memcpy(code, code_body, sizeof(code_body));
}

whereas on Windows x86_64 I get

void
emit_trampoline(
    unsigned char *code, unsigned char *data, _PyExecutorObject *executor,
    const _PyUOpInstruction *instruction, jit_state *state)
{
    // 
    // trampoline.o:       file format coff-x86-64
    // 
    // Disassembly of section .text:
    // 
    // 0000000000000000 <_JIT_ENTRY>:
    // 0: 41 57                         pushq   %r15
    // 2: 41 56                         pushq   %r14
    // 4: 41 55                         pushq   %r13
    // 6: 41 54                         pushq   %r12
    // 8: 56                            pushq   %rsi
    // 9: 57                            pushq   %rdi
    // a: 53                            pushq   %rbx
    // b: 48 81 ec a0 00 00 00          subq    $0xa0, %rsp
    // 12: 44 0f 29 bc 24 90 00 00 00    movaps  %xmm15, 0x90(%rsp)
    // 1b: 44 0f 29 b4 24 80 00 00 00    movaps  %xmm14, 0x80(%rsp)
    // 24: 44 0f 29 6c 24 70             movaps  %xmm13, 0x70(%rsp)
    // 2a: 44 0f 29 64 24 60             movaps  %xmm12, 0x60(%rsp)
    // 30: 44 0f 29 5c 24 50             movaps  %xmm11, 0x50(%rsp)
    // 36: 44 0f 29 54 24 40             movaps  %xmm10, 0x40(%rsp)
    // 3c: 44 0f 29 4c 24 30             movaps  %xmm9, 0x30(%rsp)
    // 42: 44 0f 29 44 24 20             movaps  %xmm8, 0x20(%rsp)
    // 48: 0f 29 7c 24 10                movaps  %xmm7, 0x10(%rsp)
    // 4d: 0f 29 34 24                   movaps  %xmm6, (%rsp)
    // 51: 49 89 d4                      movq    %rdx, %r12
    // 54: 4d 89 c5                      movq    %r8, %r13
    // 57: 4d 89 ce                      movq    %r9, %r14
    // 5a: ff 51 70                      callq   *0x70(%rcx)
    // 5d: 0f 28 34 24                   movaps  (%rsp), %xmm6
    // 61: 0f 28 7c 24 10                movaps  0x10(%rsp), %xmm7
    // 66: 44 0f 28 44 24 20             movaps  0x20(%rsp), %xmm8
    // 6c: 44 0f 28 4c 24 30             movaps  0x30(%rsp), %xmm9
    // 72: 44 0f 28 54 24 40             movaps  0x40(%rsp), %xmm10
    // 78: 44 0f 28 5c 24 50             movaps  0x50(%rsp), %xmm11
    // 7e: 44 0f 28 64 24 60             movaps  0x60(%rsp), %xmm12
    // 84: 44 0f 28 6c 24 70             movaps  0x70(%rsp), %xmm13
    // 8a: 44 0f 28 b4 24 80 00 00 00    movaps  0x80(%rsp), %xmm14
    // 93: 44 0f 28 bc 24 90 00 00 00    movaps  0x90(%rsp), %xmm15
    // 9c: 48 81 c4 a0 00 00 00          addq    $0xa0, %rsp
    // a3: 5b                            popq    %rbx
    // a4: 5f                            popq    %rdi
    // a5: 5e                            popq    %rsi
    // a6: 41 5c                         popq    %r12
    // a8: 41 5d                         popq    %r13
    // aa: 41 5e                         popq    %r14
    // ac: 41 5f                         popq    %r15
    // ae: c3                            retq
    const unsigned char code_body[175] = {
        0x41, 0x57, 0x41, 0x56, 0x41, 0x55, 0x41, 0x54,
        0x56, 0x57, 0x53, 0x48, 0x81, 0xec, 0xa0, 0x00,
        0x00, 0x00, 0x44, 0x0f, 0x29, 0xbc, 0x24, 0x90,
        0x00, 0x00, 0x00, 0x44, 0x0f, 0x29, 0xb4, 0x24,
        0x80, 0x00, 0x00, 0x00, 0x44, 0x0f, 0x29, 0x6c,
        0x24, 0x70, 0x44, 0x0f, 0x29, 0x64, 0x24, 0x60,
        0x44, 0x0f, 0x29, 0x5c, 0x24, 0x50, 0x44, 0x0f,
        0x29, 0x54, 0x24, 0x40, 0x44, 0x0f, 0x29, 0x4c,
        0x24, 0x30, 0x44, 0x0f, 0x29, 0x44, 0x24, 0x20,
        0x0f, 0x29, 0x7c, 0x24, 0x10, 0x0f, 0x29, 0x34,
        0x24, 0x49, 0x89, 0xd4, 0x4d, 0x89, 0xc5, 0x4d,
        0x89, 0xce, 0xff, 0x51, 0x70, 0x0f, 0x28, 0x34,
        0x24, 0x0f, 0x28, 0x7c, 0x24, 0x10, 0x44, 0x0f,
        0x28, 0x44, 0x24, 0x20, 0x44, 0x0f, 0x28, 0x4c,
        0x24, 0x30, 0x44, 0x0f, 0x28, 0x54, 0x24, 0x40,
        0x44, 0x0f, 0x28, 0x5c, 0x24, 0x50, 0x44, 0x0f,
        0x28, 0x64, 0x24, 0x60, 0x44, 0x0f, 0x28, 0x6c,
        0x24, 0x70, 0x44, 0x0f, 0x28, 0xb4, 0x24, 0x80,
        0x00, 0x00, 0x00, 0x44, 0x0f, 0x28, 0xbc, 0x24,
        0x90, 0x00, 0x00, 0x00, 0x48, 0x81, 0xc4, 0xa0,
        0x00, 0x00, 0x00, 0x5b, 0x5f, 0x5e, 0x41, 0x5c,
        0x41, 0x5d, 0x41, 0x5e, 0x41, 0x5f, 0xc3,
    };
    // 0: '\x00\x00\x00\x00'
    // 4: 00 00 00 00
    memcpy(code, code_body, sizeof(code_body));
}

which tells me, clang-cl is saving a ton more on the stack here.
Why that big difference? I wouldn't have assumed that from reading https://clang.llvm.org/docs/AttributeReference.html#preserve-none

On X86-64, only RSP and RBP are preserved by the callee. Registers R12, R13, R14, R15, RDI, RSI, RDX, RCX, R8, R9, R11, and RAX now can be used to pass function arguments. Floating-point registers (XMMs/YMMs) still follow the C calling convention.

@chris-eibl
Copy link
Member

Oh, I think I've found the answer myself: adding __attribute__((ms_abi)) to the _JIT_ENTRY function results in similar code as in the stencil, so it is just following MS ABI ...

@chris-eibl
Copy link
Member

Maybe also restore the comment

 // Note that this is *not* a tail call:

that got lost in #137961? Because it is utterly important that we use jit_func_preserve_none here and not jit_func, because this would result in a tail call.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants